{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 5.2 Case Study: Building an Employee Attrition Prediction Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**5.2.1 Building the Model**"
   ]
  },
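  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. Reading and preprocessing the data\n",
    "\n",
    "First, read the employee records together with their attrition outcome, that is, whether each employee left (the `离职` column). The code is as follows:"
   ]
  },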
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1.5.3'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pd.__version__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>工资</th>\n",
       "      <th>满意度</th>\n",
       "      <th>考核得分</th>\n",
       "      <th>工程数量</th>\n",
       "      <th>月工时</th>\n",
       "      <th>工龄</th>\n",
       "      <th>离职</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>低</td>\n",
       "      <td>3.8</td>\n",
       "      <td>0.53</td>\n",
       "      <td>2</td>\n",
       "      <td>157</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>中</td>\n",
       "      <td>8.0</td>\n",
       "      <td>0.86</td>\n",
       "      <td>5</td>\n",
       "      <td>262</td>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>中</td>\n",
       "      <td>1.1</td>\n",
       "      <td>0.88</td>\n",
       "      <td>7</td>\n",
       "      <td>272</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>低</td>\n",
       "      <td>7.2</td>\n",
       "      <td>0.87</td>\n",
       "      <td>5</td>\n",
       "      <td>223</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>低</td>\n",
       "      <td>3.7</td>\n",
       "      <td>0.52</td>\n",
       "      <td>2</td>\n",
       "      <td>159</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>低</td>\n",
       "      <td>4.1</td>\n",
       "      <td>0.50</td>\n",
       "      <td>2</td>\n",
       "      <td>153</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>低</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.77</td>\n",
       "      <td>6</td>\n",
       "      <td>247</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>低</td>\n",
       "      <td>9.2</td>\n",
       "      <td>0.85</td>\n",
       "      <td>5</td>\n",
       "      <td>259</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>低</td>\n",
       "      <td>8.9</td>\n",
       "      <td>1.00</td>\n",
       "      <td>5</td>\n",
       "      <td>224</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>低</td>\n",
       "      <td>4.2</td>\n",
       "      <td>0.53</td>\n",
       "      <td>2</td>\n",
       "      <td>142</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  工资  满意度  考核得分  工程数量  月工时  工龄  离职\n",
       "0  低  3.8  0.53     2  157   3   1\n",
       "1  中  8.0  0.86     5  262   6   1\n",
       "2  中  1.1  0.88     7  272   4   1\n",
       "3  低  7.2  0.87     5  223   5   1\n",
       "4  低  3.7  0.52     2  159   3   1\n",
       "5  低  4.1  0.50     2  153   3   1\n",
       "6  低  1.0  0.77     6  247   4   1\n",
       "7  低  9.2  0.85     5  259   5   1\n",
       "8  低  8.9  1.00     5  224   5   1\n",
       "9  低  4.2  0.53     2  142   3   1"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df = pd.read_excel('员工离职预测模型.xlsx')\n",
    "df.head(10)"
   ]
  },
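  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, encode the text-valued `工资` (salary) column as integers: 低 (low) as 0, 中 (medium) as 1, and 高 (high) as 2. The code is as follows:"
   ]
  },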
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>工资</th>\n",
       "      <th>满意度</th>\n",
       "      <th>考核得分</th>\n",
       "      <th>工程数量</th>\n",
       "      <th>月工时</th>\n",
       "      <th>工龄</th>\n",
       "      <th>离职</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>3.8</td>\n",
       "      <td>0.53</td>\n",
       "      <td>2</td>\n",
       "      <td>157</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>8.0</td>\n",
       "      <td>0.86</td>\n",
       "      <td>5</td>\n",
       "      <td>262</td>\n",
       "      <td>6</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>1.1</td>\n",
       "      <td>0.88</td>\n",
       "      <td>7</td>\n",
       "      <td>272</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>7.2</td>\n",
       "      <td>0.87</td>\n",
       "      <td>5</td>\n",
       "      <td>223</td>\n",
       "      <td>5</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>3.7</td>\n",
       "      <td>0.52</td>\n",
       "      <td>2</td>\n",
       "      <td>159</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   工资  满意度  考核得分  工程数量  月工时  工龄  离职\n",
       "0   0  3.8  0.53     2  157   3   1\n",
       "1   1  8.0  0.86     5  262   6   1\n",
       "2   1  1.1  0.88     7  272   4   1\n",
       "3   0  7.2  0.87     5  223   5   1\n",
       "4   0  3.7  0.52     2  159   3   1"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.replace({'工资': {'低': 0, '中': 1, '高': 2}})\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Extracting the feature variables and the target variable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = df.drop(columns='离职')  # feature variables: every column except 离职 (attrition)\n",
    "y = df['离职']  # target variable: 1 = left, 0 = stayed"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. Splitting the data into training and test sets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. Building and training the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-1 {color: black;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>DecisionTreeClassifier(max_depth=3, random_state=123)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">DecisionTreeClassifier</label><div class=\"sk-toggleable__content\"><pre>DecisionTreeClassifier(max_depth=3, random_state=123)</pre></div></div></div></div></div>"
      ],
      "text/plain": [
       "DecisionTreeClassifier(max_depth=3, random_state=123)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.tree import DecisionTreeClassifier\n",
    "model = DecisionTreeClassifier(max_depth=3, random_state=123) \n",
    "model.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "       工资  满意度  考核得分  工程数量  月工时  工龄\n",
      "3553    0  7.3  0.93     5  162   4\n",
      "2112    0  4.3  0.52     2  160   3\n",
      "1794    0  3.8  0.51     2  159   3\n",
      "13886   0  6.3  0.71     4  244   2\n",
      "11251   1  8.8  0.71     5  219   2\n",
      "...    ..  ...   ...   ...  ...  ..\n",
      "5218    1  6.3  0.80     4  256   4\n",
      "12252   0  9.2  0.76     5  132   3\n",
      "1346    0  7.3  0.95     4  223   6\n",
      "11646   1  8.5  0.76     3  197   5\n",
      "3582    1  5.6  0.58     4  258   3\n",
      "\n",
      "[12000 rows x 6 columns]\n",
      "3553     1\n",
      "2112     1\n",
      "1794     1\n",
      "13886    0\n",
      "11251    0\n",
      "        ..\n",
      "5218     0\n",
      "12252    0\n",
      "1346     1\n",
      "11646    0\n",
      "3582     0\n",
      "Name: 离职, Length: 12000, dtype: int64\n",
      "       工资  满意度  考核得分  工程数量  月工时  工龄\n",
      "6958    0  8.4  0.93     6  166   4\n",
      "7534    1  9.2  0.99     3  190   3\n",
      "2975    1  7.4  0.91     4  232   5\n",
      "3903    1  8.2  0.83     3  271   2\n",
      "8437    1  2.3  0.88     5  238   6\n",
      "...    ..  ...   ...   ...  ...  ..\n",
      "1229    0  4.2  0.55     2  148   3\n",
      "10594   0  7.8  0.93     4  161   3\n",
      "13211   1  7.0  0.84     3  260   4\n",
      "3147    0  4.0  0.50     2  141   3\n",
      "6623    1  6.2  0.52     3  148   3\n",
      "\n",
      "[3000 rows x 6 columns]\n",
      "6958     0\n",
      "7534     0\n",
      "2975     1\n",
      "3903     0\n",
      "8437     0\n",
      "        ..\n",
      "1229     1\n",
      "10594    0\n",
      "13211    0\n",
      "3147     1\n",
      "6623     0\n",
      "Name: 离职, Length: 3000, dtype: int64\n"
     ]
    }
   ],
   "source": [
    "print(X_train)\n",
    "print(y_train)\n",
    "print(X_test)\n",
    "print(y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model-building code above, consolidated:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-3 {color: black;}#sk-container-id-3 pre{padding: 0;}#sk-container-id-3 div.sk-toggleable {background-color: white;}#sk-container-id-3 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-3 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-3 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-3 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-3 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-3 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-3 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-3 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-3 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-3 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-3 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-3 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-3 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-3 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-3 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-3 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-3 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-3 div.sk-item {position: relative;z-index: 1;}#sk-container-id-3 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-3 div.sk-item::before, #sk-container-id-3 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-3 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-3 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-3 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-3 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-3 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-3 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-3 div.sk-label-container {text-align: center;}#sk-container-id-3 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-3 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-3\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>DecisionTreeClassifier(max_depth=3, random_state=123)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-3\" type=\"checkbox\" checked><label for=\"sk-estimator-id-3\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">DecisionTreeClassifier</label><div class=\"sk-toggleable__content\"><pre>DecisionTreeClassifier(max_depth=3, random_state=123)</pre></div></div></div></div></div>"
      ],
      "text/plain": [
       "DecisionTreeClassifier(max_depth=3, random_state=123)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 1. Read the data and do simple preprocessing\n",
    "import pandas as pd\n",
    "df = pd.read_excel('员工离职预测模型.xlsx')\n",
    "df = df.replace({'工资': {'低': 0, '中': 1, '高': 2}})\n",
    "\n",
    "# 2. Extract the feature variables and the target variable\n",
    "X = df.drop(columns='离职')\n",
    "y = df['离职']\n",
    "\n",
    "# 3. Split the data into training and test sets\n",
    "from sklearn.model_selection import train_test_split\n",
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)\n",
    "\n",
    "# 4. Build and train the model\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "model = DecisionTreeClassifier(max_depth=3, random_state=123)\n",
    "model.fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**5.2.2 Model Prediction and Evaluation**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1. Predicting attrition directly**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 1 0\n",
      " 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0\n",
      " 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 1 1 1 0 0 0]\n"
     ]
    }
   ],
   "source": [
    "y_pred = model.predict(X_test)\n",
    "print(y_pred[0:100])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>预测值</th>\n",
       "      <th>实际值</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    预测值  实际值\n",
       "0     0    0\n",
       "1     0    0\n",
       "2     1    1\n",
       "3     0    0\n",
       "4     0    0\n",
       "5     0    0\n",
       "6     1    1\n",
       "7     0    0\n",
       "8     0    0\n",
       "9     0    0\n",
       "10    0    0\n",
       "11    0    0\n",
       "12    0    0\n",
       "13    0    0\n",
       "14    0    0\n",
       "15    1    1\n",
       "16    0    0\n",
       "17    1    1\n",
       "18    0    0\n",
       "19    0    0"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
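   ],
   "source": [
    "# Compare predictions with actual values side by side in a DataFrame\n",
    "a = pd.DataFrame()  # create an empty DataFrame\n",
    "a['预测值'] = list(y_pred)  # predicted values: y_pred is a 1-D array, converted to a list\n",
    "a['实际值'] = list(y_test)  # actual values: y_test is a 1-D Series, converted to a list\n",
    "a.head(20)"
   ]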
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.9573333333333334\n"
     ]
    }
   ],
   "source": [
    "# To check the overall prediction accuracy, call accuracy_score(y_true, y_pred):\n",
    "from sklearn.metrics import accuracy_score\n",
    "score = accuracy_score(y_test, y_pred)\n",
    "print(score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9573333333333334"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Alternatively, use the model's built-in score method to check the prediction accuracy\n",
    "model.score(X_test, y_test)"
   ]
  },
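  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2. Predicting the probabilities of staying and leaving**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A classification decision tree does not fundamentally predict a hard 0 or 1 label; it predicts the probability that a sample belongs to each class. For a decision tree, this probability is the fraction of training samples of that class in the leaf the sample falls into, and `predict` returns the class with the highest probability. The following code shows the predicted probability of each class:"
   ]
  },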
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0.98526077 0.01473923]\n",
      " [0.98526077 0.01473923]\n",
      " [0.28600613 0.71399387]\n",
      " [0.98526077 0.01473923]\n",
      " [0.92283214 0.07716786]]\n"
     ]
    }
   ],
   "source": [
    "y_pred_proba = model.predict_proba(X_test)\n",
    "print(y_pred_proba[0:5])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>不离职概率</th>\n",
       "      <th>离职概率</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.985261</td>\n",
       "      <td>0.014739</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.985261</td>\n",
       "      <td>0.014739</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.286006</td>\n",
       "      <td>0.713994</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.985261</td>\n",
       "      <td>0.014739</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.922832</td>\n",
       "      <td>0.077168</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      不离职概率      离职概率\n",
       "0  0.985261  0.014739\n",
       "1  0.985261  0.014739\n",
       "2  0.286006  0.713994\n",
       "3  0.985261  0.014739\n",
       "4  0.922832  0.077168"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Columns follow model.classes_, here [0, 1]: probability of staying, probability of leaving\n",
    "b = pd.DataFrame(y_pred_proba, columns=['不离职概率', '离职概率'])\n",
    "b.head()"
   ]
  },
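  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To look at the attrition probability alone, take the second column of `y_pred_proba` with the code below. This is standard column selection on a two-dimensional array: the `:` before the comma selects all rows, and the `1` after the comma selects the second column. Changing the `1` to `0` extracts the first column instead, the probability of staying."
   ]
  },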
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.01473923, 0.01473923, 0.71399387, ..., 0.01473923, 0.94594595,\n",
       "       0.01473923])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_pred_proba[:, 1]  # second column of every row"
   ]
  },
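  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3. Evaluating the model's predictive performance**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In Python, the code covered in Section 4.3 computes the hit rate (TPR) and the false alarm rate (FPR) at different thresholds, from which the ROC curve can be plotted."
   ]
  },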
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import roc_curve\n",
    "fpr, tpr, thres = roc_curve(y_test, y_pred_proba[:,1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过4.3节相关代码可以查看不同阈值下的假警报率和命中率，代码如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>阈值</th>\n",
       "      <th>假警报率</th>\n",
       "      <th>命中率</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>inf</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.247110</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.945946</td>\n",
       "      <td>0.008232</td>\n",
       "      <td>0.677746</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.713994</td>\n",
       "      <td>0.038128</td>\n",
       "      <td>0.942197</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.077168</td>\n",
       "      <td>0.159879</td>\n",
       "      <td>0.969653</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.059406</td>\n",
       "      <td>0.171577</td>\n",
       "      <td>0.972543</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.045763</td>\n",
       "      <td>0.240035</td>\n",
       "      <td>0.976879</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>0.014739</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         阈值      假警报率       命中率\n",
       "0       inf  0.000000  0.000000\n",
       "1  1.000000  0.000000  0.247110\n",
       "2  0.945946  0.008232  0.677746\n",
       "3  0.713994  0.038128  0.942197\n",
       "4  0.077168  0.159879  0.969653\n",
       "5  0.059406  0.171577  0.972543\n",
       "6  0.045763  0.240035  0.976879\n",
       "7  0.014739  1.000000  1.000000"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = pd.DataFrame()  # 创建一个空DataFrame \n",
    "a['阈值'] = list(thres)\n",
    "a['假警报率'] = list(fpr)\n",
    "a['命中率'] = list(tpr)\n",
    "a"
   ]
  },
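  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the table above concrete, the hit rate and false alarm rate at a single threshold can be computed by hand. This is a minimal sketch on made-up labels and scores (not the employee data); `rates_at` is a hypothetical helper, and a sample counts as predicted positive when its score is greater than or equal to the threshold, which is how `roc_curve` applies its thresholds.\n",
    "\n",
    "```python\n",
    "# Made-up labels (1 = left the company) and predicted scores, for illustration only\n",
    "y_true = [0, 0, 1, 1, 0, 1]\n",
    "y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]\n",
    "\n",
    "def rates_at(threshold):\n",
    "    # Return (FPR, TPR) when score >= threshold is predicted as positive\n",
    "    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)\n",
    "    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)\n",
    "    positives = sum(y_true)\n",
    "    negatives = len(y_true) - positives\n",
    "    return fp / negatives, tp / positives\n",
    "\n",
    "for t in (0.8, 0.5, 0.35):\n",
    "    fpr_t, tpr_t = rates_at(t)\n",
    "    print(t, fpr_t, tpr_t)\n",
    "```\n",
    "\n",
    "Lowering the threshold can only add predicted positives, so both rates rise (or stay flat) as the threshold falls, which is the pattern visible in the table."
   ]
  },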
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "已知了不同阈值下的假警报率和命中率，可通过matplotlib库可绘制ROC曲线，代码如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAiMAAAGdCAYAAADAAnMpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAApDElEQVR4nO3df2zUdb7v8dd0pp1CoWWhMBQotSgoLndxbYNSlmx0tQa9bEx2Y3PdK/6AExv1IPToiV1OdCEmze7Z5aKrwK6CxgQ9jb82nqSr9N6cxQqes4eecrMRcmFptSAttWVty6+2M/O5f3RmOtNOoTPtzIfpPB/JxPbb77fz6Tfsvl/z+Xy+n4/DGGMEAABgSYbtBgAAgPRGGAEAAFYRRgAAgFWEEQAAYBVhBAAAWEUYAQAAVhFGAACAVYQRAABglct2A8bC7/frzJkzmj59uhwOh+3mAACAMTDGqLe3V/PmzVNGxuj9HykRRs6cOaPCwkLbzQAAAHE4deqUFixYMOrPUyKMTJ8+XdLgH5Obm2u5NQAAYCx6enpUWFgYquOjSYkwEhyayc3NJYwAAJBirjbFggmsAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKsIIwAAwKqYw8inn36qtWvXat68eXI4HPrDH/5w1WsOHDigkpISZWdna9GiRdq9e3c8bQUAAJNQzGHkwoULWr58uV555ZUxnd/S0qJ7771Xq1evVlNTk37+859r48aNev/992NuLAAAmHxi3ptmzZo1WrNmzZjP3717txYuXKgdO3ZIkpYuXarDhw/r17/+tX7yk5/E+vYAAGCSSfhGeZ9//rnKy8sjjt1zzz3as2ePBgYGlJmZOeKavr4+9fX1hb7v6elJdDMBALgm+f1GA36/vD6jAZ9f/T6/BnxGA16/vH6/+r2Dx4M/C543+H3YeYGvgz8bCDtvwGf005IFWjY/z8rfmPAw0t7eLo/HE3HM4/HI6/Wqs7NTBQUFI66pqanR1q1bE900AEAaChb3gfDi7B/6OljsvcMK//DzgsXeG174ff5A8TeD13qjF/5+39Dv8IZ/7zUjwoLXb5JyX24t+s7kDSPSyK2DjTFRjwdVV1erqqoq9H1PT48KCwsT10AAQNz8/qFiOuLT+/BP4d7IT+/h53mHFfSRIcFEL+DBIu6PLOgj32fwvXxJKu6J4nBImc4MZTkzlOl0KNOZEXiFfe3KUJbTIVfG0NfBn7mcjsC1wXMdyszI0A2zp1n7mxIeRubOnav29vaIYx0dHXK5XJo1a1bUa9xut9xud6KbBgDXJJ9/ZBHvD3zaDn4d/Jn3SoU/dG6UYh/13OAn8cGCPuqn92BPgX/yFPesYHF3ZciVMVi4s1xDBd7ljCzow4t/litwXsZgcc8a5bxMp0NZrsB5TkcgKEQLCcNCRth5zozoH+RTWcLDyMqVK/Wv//qvEcf279+v0tLSqPNFAGCiBYt7sNvc6zcRBX1E4Q99MjeBMfkrFPvAJ3GvP7Kgj/zdkV36A76hNngDn/6DxT7Fa7ucGQ65Mhyh4h4srFmBghsq4IFP5YOFeaigjyzGgU/5wwr6YPEf+jpqEQ+EisiAEXneZCzuqSbmMHL+/Hn99a9/DX3f0tKiI0eOaObMmVq4cKGqq6v19ddf66233pIkVVZW6pVXXlFVVZX+7u/+Tp9//rn27Nmjd955Z+L+CgBJY4wJFPdAt7h3WBEP/xQ9ShGP2qUfLOL+yHHz8UzI6w/0BkyG4h5e0Ed8ig4W8Yywr6MW/yif8qMU58zA73aFfZ0ZtfhHnkdxR7xiDiOHDx/WHXfcEfo+OLfj4Ycf1ptvvqm2tja1traGfl5cXKy6ujpt3rxZr776qubNm6eXX36Zx3qBgPDifqUx8Ohj8iPH44eK8LBP7xFd7WObkBfepvBuepPixT34KTlyjP0Kn94Dn8ojxuhdGcrM
iCzo4Z/eRx2jd0Uv/KOel5GhDIo7JjmHMdf+/6309PQoLy9P3d3dys3Ntd0cXOOMMaGx9WARjzYGPqILPayrfXjxH/W6q0yUGzFG7w0U/2HBIdVlBifKOSO7zK84Bh6aZBfoQndFFvsRY/TDin/4p/xM5yhj9K6RxZ7iDiTPWOt3Up6mAaTBkNDn9etCn1cX+3260O/VhT6vLvT5dLE/7L/9Pl3sC/y3f+TP+0aMs4+ckJfqho+XD31/5THwEZ/eR0yyyxgKDq7hE/KGjdGPMpY/Yia+0zHqk3EAMBaEEYyZMUan/3ZJ/6+9V3+72B8KFBf7hv57vt8bESTCf3ah32tt7D68qz3UhT7KGPiIyW/hn6rDCnrUYj+8iz9shn3k5LyR12WGzqW4A0gvhBFE1Xt5QCc6zut4e6+OtfXoWFuvjrX3qPeyd0J+/5RMp3LcTk3NcinH7VJOllNTg//Ncg39LPy426WpmU65MzOiFv/wgh5e7CnuAHBtI4ykuQt93sHQcbZXJ8726vjZ8zpxtldnui9HPT/T6dANc6bLk+tWTpZLU7OcynGP/G9OlktT3c6Ic4KBYkqmk9n2AIAQwkiauNjv1V87zofCxvFA8Pj620ujXuPJdWuJZ7pumjtdSwtytbQgV9fPnqYsV8ybPQMAMCrCyCRzqd+nk9+cD4WNE2d7dbyjV6f/dmnUxzFnT3driWeaFs+ZriWe6aGv86ayKB0AIPEIIynq8sBg6DhxNix4dPSq9dzFUUNH/rSsQOCYpsWeoeAxY2pWchsPAEAYwsg1rs/rU/M3FwJzOgJzOzrO66uuC6M+mTIzJ0uL50wb6uUIBI+ZOYQOAMC1hzByjej3+tXSeSFiIunxjl591XVx1E2oZkzN1JI507XYMxg8gv/Nn8YmgwCA1EEYSbIBn19fdl4YDBtne3WiYzB4fNl5Qd5RQsf0bJdu9EwP9HAMBY/Z09w8sgoASHmEkQTx+vz6sutiRC/HibO9aum8MOoKodPdLt3gmRbR27HEM/gYLaEDADBZEUYmwKlzF/XFmZ7AkyuDT7A0f3Nh1D1HcrKcusEzXUvmRA6vFORlEzoAAGmHMDJO//voWW1463DUn03JdGpx6JHZoeAxf8YUQgcAAAGEkXE6eLJTkjR/xhTdVjwzYl7H/BlT2B0UAICrIIyMU/M3FyRJT915g/7HioWWWwMAQOphXe9xau48L0lalJ9juSUAAKQmwsg4XB7w6fTfBvd2WTR7muXWAACQmggj4/BV1+DS69OzXcqfxuqmAADEgzAyDs3fBIZoZk/j6RgAAOJEGBmH5s7ByavMFwEAIH6EkXEIPklDGAEAIH6EkXEIPUnD5FUAAOJGGImTMWaoZ2Q2PSMAAMSLMBKncxf61X1pQA6HVMwwDQAAcSOMxCk4eXVe3hRlZzottwYAgNRFGIlTC0M0AABMCMJInE6yDDwAABOCMBKnocmrPEkDAMB4EEbiNLT6Kj0jAACMB2EkDl6fX63nLkqiZwQAgPEijMTh1N8uacBnlJ2ZoYLcbNvNAQAgpRFG4tASmLx63awcZWSwQR4AAONBGIlDcPLq9QzRAAAwboSROJxkjREAACYMYSQOPEkDAMDEIYzEIbgU/KJ8hmkAABgvwkiMei8P6JvePklSMT0jAACMG2EkRi2BXpH8aW7lZmdabg0AAKmPMBKjZiavAgAwoQgjMQpOXr2eMAIAwIQgjMToJJNXAQCYUISRGDFMAwDAxCKMxMDvN6Gl4NkgDwCAiUEYiUFbz2VdHvDLleHQgu9Msd0cAAAmBcJIDFoCQzQLZ01VppNbBwDARKCixqA5OETD5FUAACYMYSQGQ7v1MnkVAICJQhiJwUk2yAMAYMIRRmIw9FgvwzQAAEwUwsgYXR7w6Uz3JUlScT49IwAATBTCyBh92XVBxki52S7Nysmy3RwAACYNwsgYhQ/ROBwOy60BAGDyIIyMUTOTVwEASAjCyBgNPdbL5FUAACYS
YWSMgrv1MnkVAICJRRgZA2MMwzQAACQIYWQMui70q/eyVw6HdN0swggAABMprjCyc+dOFRcXKzs7WyUlJWpoaLji+fv27dPy5cs1depUFRQU6NFHH1VXV1dcDbYhOF9k/owpys50Wm4NAACTS8xhpLa2Vps2bdKWLVvU1NSk1atXa82aNWptbY16/meffaZ169Zp/fr1+uKLL/Tuu+/qP//zP7Vhw4ZxNz5ZhoZomLwKAMBEizmMbN++XevXr9eGDRu0dOlS7dixQ4WFhdq1a1fU8//93/9d1113nTZu3Kji4mL94Ac/0OOPP67Dhw+Pu/HJ0hyYvLqIyasAAEy4mMJIf3+/GhsbVV5eHnG8vLxchw4dinpNWVmZTp8+rbq6OhljdPbsWb333nu67777Rn2fvr4+9fT0RLxsYvIqAACJE1MY6ezslM/nk8fjiTju8XjU3t4e9ZqysjLt27dPFRUVysrK0ty5czVjxgz99re/HfV9ampqlJeXF3oVFhbG0swJF1p9NZ9hGgAAJlpcE1iHL4dujBl1ifSjR49q48aNev7559XY2KiPP/5YLS0tqqysHPX3V1dXq7u7O/Q6depUPM2cEAM+v1rPXZREzwgAAIngiuXk/Px8OZ3OEb0gHR0dI3pLgmpqarRq1So9++yzkqTvfe97ysnJ0erVq/Xiiy+qoKBgxDVut1tutzuWpiXMqXMX5fUbTcl0am5utu3mAAAw6cTUM5KVlaWSkhLV19dHHK+vr1dZWVnUay5evKiMjMi3cToHH481xsTy9lYEh2iK83OUkcEGeQAATLSYh2mqqqr0+uuva+/evTp27Jg2b96s1tbW0LBLdXW11q1bFzp/7dq1+uCDD7Rr1y41Nzfr4MGD2rhxo1asWKF58+ZN3F+SIM2dTF4FACCRYhqmkaSKigp1dXVp27Ztamtr07Jly1RXV6eioiJJUltbW8SaI4888oh6e3v1yiuv6B/+4R80Y8YM3XnnnfrlL385cX9FAg1NXiWMAACQCA6TAmMlPT09ysvLU3d3t3Jzc5P63g/s/lx//vKcdlTcovu/Pz+p7w0AQCoba/1mb5qrCC14xjANAAAJQRi5gp7LA+o83ydpcAIrAACYeISRKwjOF5kz3a3p2ZmWWwMAwOREGLkCloEHACDxCCNXMLTGCMvAAwCQKISRKwiuMXI9PSMAACQMYeQKQmuMEEYAAEgYwsgo/H6jL7vYrRcAgEQjjIziTPclXR7wK9Pp0ILvTLHdHAAAJi3CyCiCQzRFs3LkcnKbAABIFKrsKIKP9bLYGQAAiUUYGQXLwAMAkByEkVEEh2muZ/IqAAAJRRgZRQs9IwAAJAVhJIpL/T59/e0lSdKi2fSMAACQSISRKIK9IjOmZmpmTpbl1gAAMLkRRqIILgPPkzQAACQeYSSK0DLwTF4FACDhCCNRBNcYYfIqAACJRxiJIjhnhN16AQBIPMLIMMaYsN16GaYBACDRCCPDfHO+T719XjkcUtGsqbabAwDApEcYGSbYK7LgO1PkdjkttwYAgMmPMDIMT9IAAJBchJFheJIGAIDkIowMM7QnDT0jAAAkA2FkmObgY72svgoAQFIQRsL0e/1qPXdREj0jAAAkC2EkTOu5i/L5jaZmOeXJddtuDgAAaYEwEiY4ebU4P0cOh8NyawAASA+EkTDNTF4FACDpCCNhWkJrjDB5FQCAZCGMhGnuZI0RAACSjTASpqUz8CQNq68CAJA0hJEw5/sGJEkzpmZabgkAAOmDMBJmwGckSVkubgsAAMlC1Q3w+Y18/kAYcXJbAABIFqpuQL/XH/o6k54RAACShqob0O8bCiP0jAAAkDxU3YCInhEnq68CAJAshJGAgUDPSJYzg6XgAQBIIsJIQLBnhF4RAACSizASEOoZYfIqAABJReUN6PMSRgAAsIHKGxDsGcnkSRoAAJKKyhvQT88IAABWUHkD+sOepgEAAMlD5Q1gAisAAHZQeQOGHu3llgAA
kExU3oB+H5vkAQBgA5U3gAmsAADYQeUN4NFeAADsoPIGBHtG3PSMAACQVFTeAPamAQDADsJIQD+P9gIAYAWVN4BHewEAsCOuyrtz504VFxcrOztbJSUlamhouOL5fX192rJli4qKiuR2u3X99ddr7969cTU4UVj0DAAAO1yxXlBbW6tNmzZp586dWrVqlX73u99pzZo1Onr0qBYuXBj1mgceeEBnz57Vnj17dMMNN6ijo0Ner3fcjZ9IPNoLAIAdMYeR7du3a/369dqwYYMkaceOHfrkk0+0a9cu1dTUjDj/448/1oEDB9Tc3KyZM2dKkq677rrxtToBBtibBgAAK2KqvP39/WpsbFR5eXnE8fLych06dCjqNR999JFKS0v1q1/9SvPnz9eSJUv0zDPP6NKlS6O+T19fn3p6eiJeicZGeQAA2BFTz0hnZ6d8Pp88Hk/EcY/Ho/b29qjXNDc367PPPlN2drY+/PBDdXZ26oknntC5c+dGnTdSU1OjrVu3xtK0cesLTmBlmAYAgKSKq/I6HJFrcRhjRhwL8vv9cjgc2rdvn1asWKF7771X27dv15tvvjlq70h1dbW6u7tDr1OnTsXTzJgMsDcNAABWxNQzkp+fL6fTOaIXpKOjY0RvSVBBQYHmz5+vvLy80LGlS5fKGKPTp09r8eLFI65xu91yu92xNG3c+r0+SfSMAACQbDFV3qysLJWUlKi+vj7ieH19vcrKyqJes2rVKp05c0bnz58PHTt+/LgyMjK0YMGCOJqcGMGeETc9IwAAJFXMlbeqqkqvv/669u7dq2PHjmnz5s1qbW1VZWWlpMEhlnXr1oXOf/DBBzVr1iw9+uijOnr0qD799FM9++yzeuyxxzRlypSJ+0vGiUd7AQCwI+ZHeysqKtTV1aVt27apra1Ny5YtU11dnYqKiiRJbW1tam1tDZ0/bdo01dfX6+///u9VWlqqWbNm6YEHHtCLL744cX/FBOhn114AAKxwGGOM7UZcTU9Pj/Ly8tTd3a3c3NyEvMf9rx7UkVPf6rV1pbr75ujzXwAAwNiNtX7TDRAw4GPXXgAAbCCMBDBnBAAAO6i8AazACgCAHVTegAF6RgAAsILKGxDqGSGMAACQVFTegOCcER7tBQAguai8AcwZAQDADipvQGijPIZpAABIKiqvJJ/fyOdn114AAGyg8mpovojErr0AACQblVdD80UkekYAAEg2Kq+G9YywHDwAAElFGNHQvjRZzgw5HIQRAACSiTAi9qUBAMAmqq/YsRcAAJsII5L66BkBAMAaqq+GnqZhKXgAAJKP6it27AUAwCaqr9iXBgAAm6i+Cnu0l54RAACSjuqrsEd76RkBACDpqL6S+gM79jKBFQCA5KP6ikXPAACwieqr8EXPuB0AACQb1VdDPSNuekYAAEg6qq8YpgEAwCaqr8JXYGVvGgAAko0wInpGAACwieorJrACAGAT1Vf0jAAAYBPVV2HLwdMzAgBA0lF9xUZ5AADYRPWV1McwDQAA1lB9JQ2wNw0AANZQfSX1e32S6BkBAMAGqq+GekaYMwIAQPJRfcWjvQAA2ET1Vfhy8NwOAACSjeorekYAALCJ6quhMMJGeQAAJB9hRGErsNIzAgBA0lF9NTRnxE0YAQAg6ai+kga8TGAFAMAWqq/C9qahZwQAgKSj+ip8Aiu3AwCAZKP6il17AQCwieqrsOXgGaYBACDp0r76+vxGPj970wAAYEvaV9/gfBGJnhEAAGxI++obnC8iMYEVAAAb0r76hveMsBw8AADJl/ZhZCDsSRqHgzACAECypX0YYcdeAADsSvsKHOwZYYgGAAA70j6M9NEzAgCAVWlfgdmXBgAAu+KqwDt37lRxcbGys7NVUlKihoaGMV138OBBuVwu3XLLLfG8bUKwYy8AAHbFXIFra2u1adMmbdmyRU1NTVq9erXWrFmj1tbWK17X3d2tdevW6Uc/+lHcjU0E9qUBAMCumCvw9u3btX79em3Y
sEFLly7Vjh07VFhYqF27dl3xuscff1wPPvigVq5cGXdjE2GAYRoAAKyKqQL39/ersbFR5eXlEcfLy8t16NChUa974403dPLkSb3wwgtjep++vj719PREvBIl9GgvPSMAAFgRUwXu7OyUz+eTx+OJOO7xeNTe3h71mhMnTui5557Tvn375HK5xvQ+NTU1ysvLC70KCwtjaWZM+gM79jJnBAAAO+KqwMNXKjXGRF291Ofz6cEHH9TWrVu1ZMmSMf/+6upqdXd3h16nTp2Kp5ljwqJnAADYNbauioD8/Hw5nc4RvSAdHR0jekskqbe3V4cPH1ZTU5OeeuopSZLf75cxRi6XS/v379edd9454jq32y232x1L0+JGGAEAwK6YKnBWVpZKSkpUX18fcby+vl5lZWUjzs/NzdVf/vIXHTlyJPSqrKzUjTfeqCNHjui2224bX+snwABP0wAAYFVMPSOSVFVVpYceekilpaVauXKlfv/736u1tVWVlZWSBodYvv76a7311lvKyMjQsmXLIq6fM2eOsrOzRxy3hZ4RAADsijmMVFRUqKurS9u2bVNbW5uWLVumuro6FRUVSZLa2tquuubItaSfvWkAALDKYYwxthtxNT09PcrLy1N3d7dyc3Mn9Hf/r/rjeun/nND/vH2hXrz/v03o7wYAIJ2NtX6n/djE0K69aX8rAACwIu0rMHNGAACwK+0rcLBnxE3PCAAAVqR9Be5nmAYAAKvSvgL3MUwDAIBVaV+BB9ibBgAAq9K+Avd7fZLoGQEAwJa0r8DBnhGWgwcAwI60r8A82gsAgF1pX4GDT9MQRgAAsCPtK3CwZ4QJrAAA2JH2FZhhGgAA7Er7CjzArr0AAFiV9mEkOGfETc8IAABWpH0FHmDOCAAAVqV9BeZpGgAA7Er7ChyawErPCAAAVqR9BWbXXgAA7Er7ChzsGWECKwAAdqR1Bfb5jfyDW9PQMwIAgCVpXYGDvSISE1gBALAlrStwcL6IRM8IAAC2pHUFDu8ZYQVWAADsSOswMhC2xojDQRgBAMCGtA4jrDECAIB9aV2FB1h9FQAA69K6Cvd52bEXAADb0jqMsC8NAAD2pXUVZsdeAADsS+sqHOoZIYwAAGBNWlfh4ARW9qUBAMCetK7C/QzTAABgXVpX4X7f4C55TGAFAMCetK7C9IwAAGBfWlfh0Aqs9IwAAGBNWlfhAZ6mAQDAurSuwvSMAABgX1pX4eA6IywHDwCAPekdRugZAQDAurSuwkNzRpyWWwIAQPpK6zASerTXxTANAAC2pHcYCS4Hz9M0AABYk9ZVeMDHomcAANiW1lW4jwmsAABYl9ZV2BvYm8ZFzwgAANZQhSUxfRUAAHsIIwAAwCrCCAAAsIowAgAArCKMAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALAqrjCyc+dOFRcXKzs7WyUlJWpoaBj13A8++EB33323Zs+erdzcXK1cuVKffPJJ3A0GAACTS8xhpLa2Vps2bdKWLVvU1NSk1atXa82aNWptbY16/qeffqq7775bdXV1amxs1B133KG1a9eqqalp3I0HAACpz2GMMbFccNttt+nWW2/Vrl27QseWLl2q+++/XzU1NWP6Hd/97ndVUVGh559/fkzn9/T0KC8vT93d3crNzY2luVe08Z0mffR/z+j5/36zHvtB8YT9XgAAMPb6HVPPSH9/vxobG1VeXh5xvLy8XIcOHRrT7/D7/ert7dXMmTNHPaevr089PT0RLwAAMDnFFEY6Ozvl8/nk8Xgijns8HrW3t4/pd/zmN7/RhQsX9MADD4x6Tk1NjfLy8kKvwsLCWJoJAABSSFwTWB0OR8T3xpgRx6J555139Itf/EK1tbWaM2fOqOdVV1eru7s79Dp16lQ8zQQAACnAFcvJ+fn5cjqdI3pBOjo6RvSWDFdbW6v169fr3Xff1V133XXFc91ut9xudyxNAwAAKSqmnpGsrCyVlJSo
vr4+4nh9fb3KyspGve6dd97RI488orffflv33XdffC0FAACTUkw9I5JUVVWlhx56SKWlpVq5cqV+//vfq7W1VZWVlZIGh1i+/vprvfXWW5IGg8i6dev00ksv6fbbbw/1qkyZMkV5eXkT+KcAAIBUFHMYqaioUFdXl7Zt26a2tjYtW7ZMdXV1KioqkiS1tbVFrDnyu9/9Tl6vV08++aSefPLJ0PGHH35Yb7755vj/AgAAkNJiDiOS9MQTT+iJJ56I+rPhAeNPf/pTPG8BAADSBHvTAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKsIIwAAwCrCCAAAsIowAgAArCKMAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKsIIwAAwCrCCAAAsIowAgAArCKMAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKsIIwAAwCrCCAAAsIowAgAArCKMAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKsIIwAAwCrCCAAAsIowAgAArCKMAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKyKK4zs3LlTxcXFys7OVklJiRoaGq54/oEDB1RSUqLs7GwtWrRIu3fvjquxAABg8ok5jNTW1mrTpk3asmWLmpqatHr1aq1Zs0atra1Rz29padG9996r1atXq6mpST//+c+1ceNGvf/+++NuPAAASH0xh5Ht27dr/fr12rBhg5YuXaodO3aosLBQu3btinr+7t27tXDhQu3YsUNLly7Vhg0b9Nhjj+nXv/71uBsPAABSX0xhpL+/X42NjSovL484Xl5erkOHDkW95vPPPx9x/j333KPDhw9rYGAg6jV9fX3q6emJeAEAgMkppjDS2dkpn88nj8cTcdzj8ai9vT3qNe3t7VHP93q96uzsjHpNTU2N8vLyQq/CwsJYmgkAAFJIXBNYHQ5HxPfGmBHHrnZ+tONB1dXV6u7uDr1OnToVTzOv6u6bPXryjuu1vDAvIb8fAABcnSuWk/Pz8+V0Okf0gnR0dIzo/QiaO3du1PNdLpdmzZoV9Rq32y232x1L0+Kydvk8rV0+L+HvAwAARhdTz0hWVpZKSkpUX18fcby+vl5lZWVRr1m5cuWI8/fv36/S0lJlZmbG2FwAADDZxDxMU1VVpddff1179+7VsWPHtHnzZrW2tqqyslLS4BDLunXrQudXVlbqq6++UlVVlY4dO6a9e/dqz549euaZZyburwAAACkrpmEaSaqoqFBXV5e2bdumtrY2LVu2THV1dSoqKpIktbW1Raw5UlxcrLq6Om3evFmvvvqq5s2bp5dfflk/+clPJu6vAAAAKcthgrNJr2E9PT3Ky8tTd3e3cnNzbTcHAACMwVjrN3vTAAAAqwgjAADAKsIIAACwijACAACsIowAAACrCCMAAMAqwggAALCKMAIAAKwijAAAAKtiXg7ehuAisT09PZZbAgAAxipYt6+22HtKhJHe3l5JUmFhoeWWAACAWPX29iovL2/Un6fE3jR+v19nzpzR9OnT5XA4Juz39vT0qLCwUKdOnWLPmwTjXicH9zk5uM/JwX1OjkTeZ2OMent7NW/ePGVkjD4zJCV6RjIyMrRgwYKE/f7c3Fz+oScJ9zo5uM/JwX1ODu5zciTqPl+pRySICawAAMAqwggAALAqrcOI2+3WCy+8ILfbbbspkx73Ojm4z8nBfU4O7nNyXAv3OSUmsAIAgMkrrXtGAACAfYQRAABgFWEEAABYRRgBAABWTfowsnPnThUXFys7O1slJSVqaGi44vkHDhxQSUmJsrOztWjRIu3evTtJLU1tsdznDz74QHfffbdmz56t3NxcrVy5Up98
8kkSW5vaYv03HXTw4EG5XC7dcsstiW3gJBHrfe7r69OWLVtUVFQkt9ut66+/Xnv37k1Sa1NXrPd53759Wr58uaZOnaqCggI9+uij6urqSlJrU9Onn36qtWvXat68eXI4HPrDH/5w1WuSXgvNJPYv//IvJjMz07z22mvm6NGj5umnnzY5OTnmq6++inp+c3OzmTp1qnn66afN0aNHzWuvvWYyMzPNe++9l+SWp5ZY7/PTTz9tfvnLX5o///nP5vjx46a6utpkZmaa//qv/0pyy1NPrPc66NtvvzWLFi0y5eXlZvny5clpbAqL5z7/+Mc/Nrfddpupr683LS0t5j/+4z/MwYMHk9jq1BPrfW5oaDAZGRnmpZdeMs3NzaahocF897vfNffff3+SW55a6urqzJYtW8z7779vJJkPP/zwiufbqIWTOoysWLHCVFZWRhy76aabzHPPPRf1/H/8x380N910U8Sxxx9/3Nx+++0Ja+NkEOt9jubmm282W7duneimTTrx3uuKigrzT//0T+aFF14gjIxBrPf5j3/8o8nLyzNdXV3JaN6kEet9/ud//mezaNGiiGMvv/yyWbBgQcLaONmMJYzYqIWTdpimv79fjY2NKi8vjzheXl6uQ4cORb3m888/H3H+Pffco8OHD2tgYCBhbU1l8dzn4fx+v3p7ezVz5sxENHHSiPdev/HGGzp58qReeOGFRDdxUojnPn/00UcqLS3Vr371K82fP19LlizRM888o0uXLiWjySkpnvtcVlam06dPq66uTsYYnT17Vu+9957uu+++ZDQ5bdiohSmxUV48Ojs75fP55PF4Io57PB61t7dHvaa9vT3q+V6vV52dnSooKEhYe1NVPPd5uN/85je6cOGCHnjggUQ0cdKI516fOHFCzz33nBoaGuRyTdr/uU+oeO5zc3OzPvvsM2VnZ+vDDz9UZ2ennnjiCZ07d455I6OI5z6XlZVp3759qqio0OXLl+X1evXjH/9Yv/3tb5PR5LRhoxZO2p6RIIfDEfG9MWbEsaudH+04IsV6n4Peeecd/eIXv1Btba3mzJmTqOZNKmO91z6fTw8++KC2bt2qJUuWJKt5k0Ys/6b9fr8cDof27dunFStW6N5779X27dv15ptv0jtyFbHc56NHj2rjxo16/vnn1djYqI8//lgtLS2qrKxMRlPTSrJr4aT9qJSfny+n0zkiYXd0dIxIfEFz586Ner7L5dKsWbMS1tZUFs99DqqtrdX69ev17rvv6q677kpkMyeFWO91b2+vDh8+rKamJj311FOSBoumMUYul0v79+/XnXfemZS2p5J4/k0XFBRo/vz5EVulL126VMYYnT59WosXL05om1NRPPe5pqZGq1at0rPPPitJ+t73vqecnBytXr1aL774Ir3XE8RGLZy0PSNZWVkqKSlRfX19xPH6+nqVlZVFvWblypUjzt+/f79KS0uVmZmZsLamsnjuszTYI/LII4/o7bffZrx3jGK917m5ufrLX/6iI0eOhF6VlZW68cYbdeTIEd12223JanpKieff9KpVq3TmzBmdP38+dOz48ePKyMjQggULEtreVBXPfb548aIyMiLLltPplDT0yR3jZ6UWJmxq7DUg+NjYnj17zNGjR82mTZtMTk6O+fLLL40xxjz33HPmoYceCp0ffJxp8+bN5ujRo2bPnj082jsGsd7nt99+27hcLvPqq6+atra20Ovbb7+19SekjFjv9XA8TTM2sd7n3t5es2DBAvPTn/7UfPHFF+bAgQNm8eLFZsOGDbb+hJQQ631+4403jMvlMjt37jQnT540n332mSktLTUrVqyw9SekhN7eXtPU1GSampqMJLN9+3bT1NQUeoT6WqiFkzqMGGPMq6++aoqKikxWVpa59dZbzYEDB0I/e/jhh80Pf/jDiPP/9Kc/me9///smKyvLXHfddWbXrl1JbnFqiuU+//CHPzSSRrwefvjh5Dc8BcX6bzocYWTsYr3Px44d
M3fddZeZMmWKWbBggamqqjIXL15McqtTT6z3+eWXXzY333yzmTJliikoKDA/+9nPzOnTp5Pc6tTyb//2b1f8/9xroRY6jKFvCwAA2DNp54wAAIDUQBgBAABWEUYAAIBVhBEAAGAVYQQAAFhFGAEAAFYRRgAAgFWEEQAAYBVhBAAAWEUYAQAAVhFGAACAVYQRAABg1f8Hjkfsk3RKhPgAAAAASUVORK5CYII=",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.plot(fpr, tpr)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过如下代码则可以快速求出模型的AUC值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.9736722483245008\n"
     ]
    }
   ],
   "source": [
    "from sklearn.metrics import roc_auc_score  # 引入roc_auc_score函数\n",
    "score = roc_auc_score(y_test, y_pred_proba[:,1])\n",
    "print(score)"
   ]
  },
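  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "AUC is the area under the ROC curve, so it can be cross-checked by integrating the `(fpr, tpr)` points directly. The sketch below uses made-up labels and scores rather than the employee data, and assumes scikit-learn is installed; `sklearn.metrics.auc` applies the trapezoidal rule to the curve points, and the `toy_` names are hypothetical.\n",
    "\n",
    "```python\n",
    "from sklearn.metrics import auc, roc_auc_score, roc_curve\n",
    "\n",
    "# Made-up labels and scores, for illustration only\n",
    "y_true = [0, 0, 1, 1, 0, 1]\n",
    "y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]\n",
    "\n",
    "toy_fpr, toy_tpr, _ = roc_curve(y_true, y_score)\n",
    "area = auc(toy_fpr, toy_tpr)            # trapezoidal area under the ROC points\n",
    "score = roc_auc_score(y_true, y_score)  # the same quantity computed directly\n",
    "print(area, score)\n",
    "```\n",
    "\n",
    "The two values agree. On this toy data, 8 of the 9 positive/negative pairs are ranked correctly, which is the pairwise reading of AUC."
   ]
  },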
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**4.特征重要性评估**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.        , 0.59810862, 0.14007392, 0.10638659, 0.00456495,\n",
       "       0.15086592])"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.feature_importances_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>特征名称</th>\n",
       "      <th>特征重要性</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>满意度</td>\n",
       "      <td>0.598109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>工龄</td>\n",
       "      <td>0.150866</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>考核得分</td>\n",
       "      <td>0.140074</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>工程数量</td>\n",
       "      <td>0.106387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>月工时</td>\n",
       "      <td>0.004565</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>工资</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   特征名称     特征重要性\n",
       "1   满意度  0.598109\n",
       "5    工龄  0.150866\n",
       "2  考核得分  0.140074\n",
       "3  工程数量  0.106387\n",
       "4   月工时  0.004565\n",
       "0    工资  0.000000"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 通过DataFrame进行展示，并根据重要性进行倒序排列\n",
    "features = X.columns  # 获取特征名称\n",
    "importances = model.feature_importances_  # 获取特征重要性\n",
    "\n",
    "# 通过二维表格形式显示\n",
    "importances_df = pd.DataFrame()\n",
    "importances_df['特征名称'] = features\n",
    "importances_df['特征重要性'] = importances\n",
    "importances_df.sort_values('特征重要性', ascending=False) # 特征重要性降序排列"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**5.2.3 决策树模型可视化呈现及决策树要点理解**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过graphviz插件进行决策树可视化，graphviz插件的更详细的使用教程可以查看该文档:https://shimo.im/docs/Dcgw8H6WxgWrc8hq/ "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "可视化文件result.pdf已经保存在代码所在文件夹！\n"
     ]
    }
   ],
   "source": [
    "# 1.如果不用显示中文，那么通过如下代码即可：\n",
    "# !pip3 install pygraphviz\n",
    "from sklearn.tree import export_graphviz\n",
    "import graphviz\n",
    "import os  # 以下这两行是手动进行环境变量配置，防止在本机环境的变量部署失败\n",
    "os.environ['PATH'] = os.pathsep + r'D:\\ProgramFiles\\Graphviz\\bin'\n",
    "\n",
    "dot_data = export_graphviz(model, out_file=None, class_names=['0', '1'])\n",
    "graph = graphviz.Source(dot_data)\n",
    "\n",
    "graph.render(\"result\")  # 导出成PDF文件\n",
    "print('可视化文件result.pdf已经保存在代码所在文件夹！')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 10.0.1 (20240210.2158)\n",
       " -->\n",
       "<!-- Title: Tree Pages: 1 -->\n",
       "<svg width=\"1100pt\" height=\"462pt\"\n",
       " viewBox=\"0.00 0.00 1100.25 461.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 457.5)\">\n",
       "<title>Tree</title>\n",
       "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-457.5 1096.25,-457.5 1096.25,4 -4,4\"/>\n",
       "<!-- 0 -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>0</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"615.5,-453.5 469.75,-453.5 469.75,-363 615.5,-363 615.5,-453.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-436.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[1] &lt;= 4.65</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-419.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.365</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-403.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 12000</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-386.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [9120, 2880]</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-370.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 1 -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>1</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"483.5,-327 337.75,-327 337.75,-236.5 483.5,-236.5 483.5,-327\"/>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-309.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[3] &lt;= 2.5</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-293.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.477</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-276.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 3367</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-260.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1325, 2042]</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-243.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 0&#45;&gt;1 -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>0&#45;&gt;1</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M495.34,-362.65C485.96,-353.8 476.03,-344.43 466.39,-335.35\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"468.8,-332.81 459.12,-328.49 464,-337.9 468.8,-332.81\"/>\n",
       "<text text-anchor=\"middle\" x=\"458.75\" y=\"-347.39\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">True</text>\n",
       "</g>\n",
       "<!-- 8 -->\n",
       "<g id=\"node9\" class=\"node\">\n",
       "<title>8</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"746.38,-327 608.88,-327 608.88,-236.5 746.38,-236.5 746.38,-327\"/>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-309.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[5] &lt;= 4.5</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-293.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.175</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-276.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 8633</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-260.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [7795, 838]</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-243.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 0&#45;&gt;8 -->\n",
       "<g id=\"edge8\" class=\"edge\">\n",
       "<title>0&#45;&gt;8</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M590.98,-362.65C600.68,-353.71 610.94,-344.25 620.89,-335.07\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"623.06,-337.83 628.04,-328.48 618.32,-332.69 623.06,-337.83\"/>\n",
       "<text text-anchor=\"middle\" x=\"628.14\" y=\"-347.38\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">False</text>\n",
       "</g>\n",
       "<!-- 2 -->\n",
       "<g id=\"node3\" class=\"node\">\n",
       "<title>2</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"271.38,-200.5 133.88,-200.5 133.88,-110 271.38,-110 271.38,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[2] &lt;= 0.575</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.208</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1396</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [165, 1231]</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 1&#45;&gt;2 -->\n",
       "<g id=\"edge2\" class=\"edge\">\n",
       "<title>1&#45;&gt;2</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M337.29,-236.86C319.25,-226.06 299.85,-214.45 281.56,-203.49\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"283.49,-200.57 273.11,-198.44 279.89,-206.58 283.49,-200.57\"/>\n",
       "</g>\n",
       "<!-- 5 -->\n",
       "<g id=\"node6\" class=\"node\">\n",
       "<title>5</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"479.38,-200.5 341.88,-200.5 341.88,-110 479.38,-110 479.38,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[1] &lt;= 1.15</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.484</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1971</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1160, 811]</text>\n",
       "<text text-anchor=\"middle\" x=\"410.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 1&#45;&gt;5 -->\n",
       "<g id=\"edge5\" class=\"edge\">\n",
       "<title>1&#45;&gt;5</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M410.62,-236.15C410.62,-228.47 410.62,-220.39 410.62,-212.44\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"414.13,-212.47 410.63,-202.47 407.13,-212.47 414.13,-212.47\"/>\n",
       "</g>\n",
       "<!-- 3 -->\n",
       "<g id=\"node4\" class=\"node\">\n",
       "<title>3</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"129.25,-74 0,-74 0,0 129.25,0 129.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.102</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1295</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [70, 1225]</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 2&#45;&gt;3 -->\n",
       "<g id=\"edge3\" class=\"edge\">\n",
       "<title>2&#45;&gt;3</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M149.67,-109.64C138.75,-100.44 127.27,-90.77 116.39,-81.61\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"118.89,-79.13 108.99,-75.37 114.38,-84.49 118.89,-79.13\"/>\n",
       "</g>\n",
       "<!-- 4 -->\n",
       "<g id=\"node5\" class=\"node\">\n",
       "<title>4</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"257.88,-74 147.38,-74 147.38,0 257.88,0 257.88,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.112</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 101</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [95, 6]</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 2&#45;&gt;4 -->\n",
       "<g id=\"edge4\" class=\"edge\">\n",
       "<title>2&#45;&gt;4</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M202.62,-109.64C202.62,-101.81 202.62,-93.63 202.62,-85.72\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"206.13,-85.91 202.63,-75.91 199.13,-85.91 206.13,-85.91\"/>\n",
       "</g>\n",
       "<!-- 6 -->\n",
       "<g id=\"node7\" class=\"node\">\n",
       "<title>6</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"389,-74 276.25,-74 276.25,0 389,0 389,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.0</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 714</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [0, 714]</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 5&#45;&gt;6 -->\n",
       "<g id=\"edge6\" class=\"edge\">\n",
       "<title>5&#45;&gt;6</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M380.69,-109.64C375.01,-101.17 369.06,-92.3 363.35,-83.8\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"366.37,-82 357.89,-75.65 360.55,-85.9 366.37,-82\"/>\n",
       "</g>\n",
       "<!-- 7 -->\n",
       "<g id=\"node8\" class=\"node\">\n",
       "<title>7</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"536.25,-74 407,-74 407,0 536.25,0 536.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.142</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1257</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1160, 97]</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 5&#45;&gt;7 -->\n",
       "<g id=\"edge7\" class=\"edge\">\n",
       "<title>5&#45;&gt;7</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M434.03,-109.64C438.38,-101.35 442.93,-92.68 447.31,-84.34\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"450.28,-86.22 451.82,-75.73 444.08,-82.96 450.28,-86.22\"/>\n",
       "</g>\n",
       "<!-- 9 -->\n",
       "<g id=\"node10\" class=\"node\">\n",
       "<title>9</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"746.38,-200.5 608.88,-200.5 608.88,-110 746.38,-110 746.38,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[4] &lt;= 290.5</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.031</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 7064</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [6952, 112]</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 8&#45;&gt;9 -->\n",
       "<g id=\"edge9\" class=\"edge\">\n",
       "<title>8&#45;&gt;9</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M677.62,-236.15C677.62,-228.47 677.62,-220.39 677.62,-212.44\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"681.13,-212.47 677.63,-202.47 674.13,-212.47 681.13,-212.47\"/>\n",
       "</g>\n",
       "<!-- 12 -->\n",
       "<g id=\"node13\" class=\"node\">\n",
       "<title>12</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"949.25,-200.5 820,-200.5 820,-110 949.25,-110 949.25,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">x[2] &lt;= 0.805</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.497</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1569</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [843, 726]</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 8&#45;&gt;12 -->\n",
       "<g id=\"edge12\" class=\"edge\">\n",
       "<title>8&#45;&gt;12</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M746.84,-239.12C767.13,-226.92 789.41,-213.52 810.02,-201.12\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"811.61,-204.25 818.38,-196.09 808,-198.25 811.61,-204.25\"/>\n",
       "</g>\n",
       "<!-- 10 -->\n",
       "<g id=\"node11\" class=\"node\">\n",
       "<title>10</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"691.38,-74 553.88,-74 553.88,0 691.38,0 691.38,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.029</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 7056</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [6952, 104]</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 9&#45;&gt;10 -->\n",
       "<g id=\"edge10\" class=\"edge\">\n",
       "<title>9&#45;&gt;10</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M656.52,-109.64C652.64,-101.44 648.59,-92.87 644.68,-84.62\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"647.93,-83.3 640.49,-75.76 641.6,-86.3 647.93,-83.3\"/>\n",
       "</g>\n",
       "<!-- 11 -->\n",
       "<g id=\"node12\" class=\"node\">\n",
       "<title>11</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"805.75,-74 709.5,-74 709.5,0 805.75,0 805.75,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.0</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 8</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [0, 8]</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 9&#45;&gt;11 -->\n",
       "<g id=\"edge11\" class=\"edge\">\n",
       "<title>9&#45;&gt;11</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M708.33,-109.64C714.15,-101.17 720.26,-92.3 726.11,-83.8\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"728.94,-85.86 731.72,-75.64 723.17,-81.89 728.94,-85.86\"/>\n",
       "</g>\n",
       "<!-- 13 -->\n",
       "<g id=\"node14\" class=\"node\">\n",
       "<title>13</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"945.12,-74 824.12,-74 824.12,0 945.12,0 945.12,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.087</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 590</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [563, 27]</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 12&#45;&gt;13 -->\n",
       "<g id=\"edge13\" class=\"edge\">\n",
       "<title>12&#45;&gt;13</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M884.62,-109.64C884.62,-101.81 884.62,-93.63 884.62,-85.72\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"888.13,-85.91 884.63,-75.91 881.13,-85.91 888.13,-85.91\"/>\n",
       "</g>\n",
       "<!-- 14 -->\n",
       "<g id=\"node15\" class=\"node\">\n",
       "<title>14</title>\n",
       "<polygon fill=\"none\" stroke=\"black\" points=\"1092.25,-74 963,-74 963,0 1092.25,0 1092.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.408</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 979</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [280, 699]</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 12&#45;&gt;14 -->\n",
       "<g id=\"edge14\" class=\"edge\">\n",
       "<title>12&#45;&gt;14</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M939.5,-109.64C950.81,-100.44 962.71,-90.77 973.98,-81.61\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"976.13,-84.37 981.68,-75.35 971.72,-78.94 976.13,-84.37\"/>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<graphviz.sources.Source at 0x274cc7953d0>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "graph  # In Jupyter Notebook, entering the variable name on its own renders the visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 10.0.1 (20240210.2158)\n",
       " -->\n",
       "<!-- Title: Tree Pages: 1 -->\n",
       "<svg width=\"1100pt\" height=\"462pt\"\n",
       " viewBox=\"0.00 0.00 1100.25 461.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 457.5)\">\n",
       "<title>Tree</title>\n",
       "<polygon fill=\"white\" stroke=\"none\" points=\"-4,4 -4,-457.5 1096.25,-457.5 1096.25,4 -4,4\"/>\n",
       "<!-- 0 -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>0</title>\n",
       "<polygon fill=\"#eda978\" stroke=\"black\" points=\"615.5,-453.5 469.75,-453.5 469.75,-363 615.5,-363 615.5,-453.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-436.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">satisfication &lt;= 4.65</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-419.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.365</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-403.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 12000</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-386.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [9120, 2880]</text>\n",
       "<text text-anchor=\"middle\" x=\"542.62\" y=\"-370.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 1 -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>1</title>\n",
       "<polygon fill=\"#b9ddf6\" stroke=\"black\" points=\"484.5,-327 338.75,-327 338.75,-236.5 484.5,-236.5 484.5,-327\"/>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-309.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">project_num &lt;= 2.5</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-293.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.477</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-276.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 3367</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-260.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1325, 2042]</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-243.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 0&#45;&gt;1 -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>0&#45;&gt;1</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M495.7,-362.65C486.39,-353.8 476.53,-344.43 466.97,-335.35\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"469.42,-332.85 459.76,-328.5 464.6,-337.92 469.42,-332.85\"/>\n",
       "<text text-anchor=\"middle\" x=\"459.3\" y=\"-347.39\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">True</text>\n",
       "</g>\n",
       "<!-- 8 -->\n",
       "<g id=\"node9\" class=\"node\">\n",
       "<title>8</title>\n",
       "<polygon fill=\"#e88f4e\" stroke=\"black\" points=\"746.38,-327 608.88,-327 608.88,-236.5 746.38,-236.5 746.38,-327\"/>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-309.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">year &lt;= 4.5</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-293.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.175</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-276.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 8633</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-260.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [7795, 838]</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-243.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 0&#45;&gt;8 -->\n",
       "<g id=\"edge8\" class=\"edge\">\n",
       "<title>0&#45;&gt;8</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M590.98,-362.65C600.68,-353.71 610.94,-344.25 620.89,-335.07\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"623.06,-337.83 628.04,-328.48 618.32,-332.69 623.06,-337.83\"/>\n",
       "<text text-anchor=\"middle\" x=\"628.14\" y=\"-347.38\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">False</text>\n",
       "</g>\n",
       "<!-- 2 -->\n",
       "<g id=\"node3\" class=\"node\">\n",
       "<title>2</title>\n",
       "<polygon fill=\"#54aae8\" stroke=\"black\" points=\"271.38,-200.5 133.88,-200.5 133.88,-110 271.38,-110 271.38,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">score &lt;= 0.575</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.208</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1396</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [165, 1231]</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 1&#45;&gt;2 -->\n",
       "<g id=\"edge2\" class=\"edge\">\n",
       "<title>1&#45;&gt;2</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M338.53,-237.21C320.08,-226.22 300.2,-214.37 281.49,-203.23\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"283.5,-200.35 273.12,-198.24 279.92,-206.37 283.5,-200.35\"/>\n",
       "</g>\n",
       "<!-- 5 -->\n",
       "<g id=\"node6\" class=\"node\">\n",
       "<title>5</title>\n",
       "<polygon fill=\"#f7d9c3\" stroke=\"black\" points=\"483,-200.5 340.25,-200.5 340.25,-110 483,-110 483,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">satisfication &lt;= 1.15</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.484</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1971</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1160, 811]</text>\n",
       "<text text-anchor=\"middle\" x=\"411.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 1&#45;&gt;5 -->\n",
       "<g id=\"edge5\" class=\"edge\">\n",
       "<title>1&#45;&gt;5</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M411.62,-236.15C411.62,-228.47 411.62,-220.39 411.62,-212.44\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"415.13,-212.47 411.63,-202.47 408.13,-212.47 415.13,-212.47\"/>\n",
       "</g>\n",
       "<!-- 3 -->\n",
       "<g id=\"node4\" class=\"node\">\n",
       "<title>3</title>\n",
       "<polygon fill=\"#44a3e6\" stroke=\"black\" points=\"129.25,-74 0,-74 0,0 129.25,0 129.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.102</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1295</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [70, 1225]</text>\n",
       "<text text-anchor=\"middle\" x=\"64.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 2&#45;&gt;3 -->\n",
       "<g id=\"edge3\" class=\"edge\">\n",
       "<title>2&#45;&gt;3</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M149.67,-109.64C138.75,-100.44 127.27,-90.77 116.39,-81.61\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"118.89,-79.13 108.99,-75.37 114.38,-84.49 118.89,-79.13\"/>\n",
       "</g>\n",
       "<!-- 4 -->\n",
       "<g id=\"node5\" class=\"node\">\n",
       "<title>4</title>\n",
       "<polygon fill=\"#e78946\" stroke=\"black\" points=\"257.88,-74 147.38,-74 147.38,0 257.88,0 257.88,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.112</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 101</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [95, 6]</text>\n",
       "<text text-anchor=\"middle\" x=\"202.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 2&#45;&gt;4 -->\n",
       "<g id=\"edge4\" class=\"edge\">\n",
       "<title>2&#45;&gt;4</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M202.62,-109.64C202.62,-101.81 202.62,-93.63 202.62,-85.72\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"206.13,-85.91 202.63,-75.91 199.13,-85.91 206.13,-85.91\"/>\n",
       "</g>\n",
       "<!-- 6 -->\n",
       "<g id=\"node7\" class=\"node\">\n",
       "<title>6</title>\n",
       "<polygon fill=\"#399de5\" stroke=\"black\" points=\"389,-74 276.25,-74 276.25,0 389,0 389,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.0</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 714</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [0, 714]</text>\n",
       "<text text-anchor=\"middle\" x=\"332.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 5&#45;&gt;6 -->\n",
       "<g id=\"edge6\" class=\"edge\">\n",
       "<title>5&#45;&gt;6</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M381.31,-109.64C375.55,-101.17 369.52,-92.3 363.75,-83.8\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"366.72,-81.95 358.21,-75.65 360.93,-85.88 366.72,-81.95\"/>\n",
       "</g>\n",
       "<!-- 7 -->\n",
       "<g id=\"node8\" class=\"node\">\n",
       "<title>7</title>\n",
       "<polygon fill=\"#e78c4a\" stroke=\"black\" points=\"536.25,-74 407,-74 407,0 536.25,0 536.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.142</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1257</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [1160, 97]</text>\n",
       "<text text-anchor=\"middle\" x=\"471.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 5&#45;&gt;7 -->\n",
       "<g id=\"edge7\" class=\"edge\">\n",
       "<title>5&#45;&gt;7</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M434.65,-109.64C438.88,-101.44 443.3,-92.87 447.56,-84.62\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"450.67,-86.23 452.15,-75.74 444.45,-83.02 450.67,-86.23\"/>\n",
       "</g>\n",
       "<!-- 9 -->\n",
       "<g id=\"node10\" class=\"node\">\n",
       "<title>9</title>\n",
       "<polygon fill=\"#e5833c\" stroke=\"black\" points=\"746.38,-200.5 608.88,-200.5 608.88,-110 746.38,-110 746.38,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">hours &lt;= 290.5</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.031</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 7064</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [6952, 112]</text>\n",
       "<text text-anchor=\"middle\" x=\"677.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 8&#45;&gt;9 -->\n",
       "<g id=\"edge9\" class=\"edge\">\n",
       "<title>8&#45;&gt;9</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M677.62,-236.15C677.62,-228.47 677.62,-220.39 677.62,-212.44\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"681.13,-212.47 677.63,-202.47 674.13,-212.47 681.13,-212.47\"/>\n",
       "</g>\n",
       "<!-- 12 -->\n",
       "<g id=\"node13\" class=\"node\">\n",
       "<title>12</title>\n",
       "<polygon fill=\"#fbeee4\" stroke=\"black\" points=\"949.25,-200.5 820,-200.5 820,-110 949.25,-110 949.25,-200.5\"/>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-183.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">score &lt;= 0.805</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-166.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.497</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-150.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 1569</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-133.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [843, 726]</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-117.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 8&#45;&gt;12 -->\n",
       "<g id=\"edge12\" class=\"edge\">\n",
       "<title>8&#45;&gt;12</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M746.84,-239.12C767.13,-226.92 789.41,-213.52 810.02,-201.12\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"811.61,-204.25 818.38,-196.09 808,-198.25 811.61,-204.25\"/>\n",
       "</g>\n",
       "<!-- 10 -->\n",
       "<g id=\"node11\" class=\"node\">\n",
       "<title>10</title>\n",
       "<polygon fill=\"#e5833c\" stroke=\"black\" points=\"691.38,-74 553.88,-74 553.88,0 691.38,0 691.38,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.029</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 7056</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [6952, 104]</text>\n",
       "<text text-anchor=\"middle\" x=\"622.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 9&#45;&gt;10 -->\n",
       "<g id=\"edge10\" class=\"edge\">\n",
       "<title>9&#45;&gt;10</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M656.52,-109.64C652.64,-101.44 648.59,-92.87 644.68,-84.62\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"647.93,-83.3 640.49,-75.76 641.6,-86.3 647.93,-83.3\"/>\n",
       "</g>\n",
       "<!-- 11 -->\n",
       "<g id=\"node12\" class=\"node\">\n",
       "<title>11</title>\n",
       "<polygon fill=\"#399de5\" stroke=\"black\" points=\"805.75,-74 709.5,-74 709.5,0 805.75,0 805.75,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.0</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 8</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [0, 8]</text>\n",
       "<text text-anchor=\"middle\" x=\"757.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 9&#45;&gt;11 -->\n",
       "<g id=\"edge11\" class=\"edge\">\n",
       "<title>9&#45;&gt;11</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M708.33,-109.64C714.15,-101.17 720.26,-92.3 726.11,-83.8\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"728.94,-85.86 731.72,-75.64 723.17,-81.89 728.94,-85.86\"/>\n",
       "</g>\n",
       "<!-- 13 -->\n",
       "<g id=\"node14\" class=\"node\">\n",
       "<title>13</title>\n",
       "<polygon fill=\"#e68742\" stroke=\"black\" points=\"945.12,-74 824.12,-74 824.12,0 945.12,0 945.12,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.087</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 590</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [563, 27]</text>\n",
       "<text text-anchor=\"middle\" x=\"884.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 0</text>\n",
       "</g>\n",
       "<!-- 12&#45;&gt;13 -->\n",
       "<g id=\"edge13\" class=\"edge\">\n",
       "<title>12&#45;&gt;13</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M884.62,-109.64C884.62,-101.81 884.62,-93.63 884.62,-85.72\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"888.13,-85.91 884.63,-75.91 881.13,-85.91 888.13,-85.91\"/>\n",
       "</g>\n",
       "<!-- 14 -->\n",
       "<g id=\"node15\" class=\"node\">\n",
       "<title>14</title>\n",
       "<polygon fill=\"#88c4ef\" stroke=\"black\" points=\"1092.25,-74 963,-74 963,0 1092.25,0 1092.25,-74\"/>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-56.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">gini = 0.408</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-40.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">samples = 979</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-23.7\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">value = [280, 699]</text>\n",
       "<text text-anchor=\"middle\" x=\"1027.62\" y=\"-7.2\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">class = 1</text>\n",
       "</g>\n",
       "<!-- 12&#45;&gt;14 -->\n",
       "<g id=\"edge14\" class=\"edge\">\n",
       "<title>12&#45;&gt;14</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M939.5,-109.64C950.81,-100.44 962.71,-90.77 973.98,-81.61\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"976.13,-84.37 981.68,-75.35 971.72,-78.94 976.13,-84.37\"/>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<graphviz.sources.Source at 0x274cc741710>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Add feature names (feature_names) and fill colors (filled=True)\n",
    "dot_data = export_graphviz(model, out_file=None, feature_names=['income', 'satisfication', 'score', 'project_num', 'hours', 'year'], class_names=['0', '1'], filled=True)\n",
    "graph = graphviz.Source(dot_data)\n",
    "\n",
    "graph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "决策树模型.png has been saved to the folder containing the code!\n",
      "决策树模型.pdf has been saved to the folder containing the code!\n"
     ]
    }
   ],
   "source": [
    "# 2. To display Chinese labels in the tree, use the following code\n",
    "from sklearn.tree import export_graphviz\n",
    "import graphviz\n",
    "import os\n",
    "import re\n",
    "\n",
    "# Append the Graphviz bin directory to PATH manually, in case the system environment variable is not configured.\n",
    "# Use += so the existing PATH is extended rather than overwritten.\n",
    "os.environ['PATH'] += os.pathsep + r'C:\\Program Files (x86)\\Graphviz2.38\\bin'\n",
    "\n",
    "# Generate dot_data (class names stay in Chinese, since rendering them is the point of this cell)\n",
    "dot_data = export_graphviz(model, out_file=None, feature_names=list(X_train.columns), class_names=['不离职', '离职'], rounded=True, filled=True)\n",
    "\n",
    "# Write the generated dot_data to a text file\n",
    "with open('dot_data.txt', 'w', encoding='utf-8') as f:\n",
    "    f.write(dot_data)\n",
    "\n",
    "# Replace the font setting with SimHei to avoid garbled Chinese characters\n",
    "with open('dot_data.txt', 'r', encoding='utf-8') as f_old, open('dot_data_new.txt', 'w', encoding='utf-8') as f_new:\n",
    "    for line in f_old:\n",
    "        if 'fontname' in line:\n",
    "            font_re = 'fontname=(.*?)]'\n",
    "            old_font = re.findall(font_re, line)[0]\n",
    "            line = line.replace(old_font, 'SimHei')\n",
    "        f_new.write(line)\n",
    "\n",
    "# Save the visualization as a PNG image\n",
    "os.system('dot -Tpng dot_data_new.txt -o 决策树模型.png')\n",
    "print('决策树模型.png has been saved to the folder containing the code!')\n",
    "\n",
    "# Save the visualization as a PDF file\n",
    "os.system('dot -Tpdf dot_data_new.txt -o 决策树模型.pdf')\n",
    "print('决策树模型.pdf has been saved to the folder containing the code!')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The generated visualization files can now be found in the folder containing the code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SVC()"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn import svm\n",
    "X = [[0, 0], [1, 1]]\n",
    "y = [0, 1]\n",
    "clf = svm.SVC()\n",
    "clf.fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "clf.predict([[2., 2.]])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
